A diskless node (or diskless workstation) is a workstation or personal computer without disk drives, which employs network booting to load its operating system from a server. (A computer may also be said to act as a diskless node, if its disks are unused and network booting is used.)
Diskless nodes (or computers acting as such) are sometimes known as network computers or hybrid clients. Hybrid client may either just mean diskless node, or it may be used in a more particular sense to mean a diskless node which runs some, but not all, applications remotely, as in the thin client computing architecture.
Advantages of diskless nodes can include lower production cost, lower running costs, quieter operation, and manageability advantages (for example, centrally managed software installation).
In many universities and in some large organizations, PCs are used in a similar configuration, with some or all applications stored remotely but executed locally—again, for manageability reasons. However, these are not diskless nodes if they still boot from a local hard drive.
Contents |
Diskless nodes process data, thus using their own CPU and RAM to run software, but do not store data persistently—that task is handed off to a server. This is distinct from thin clients, in which all significant processing happens remotely, on the server—the only software that runs on a thin client is the "thin" (i.e. relatively small and simple) client software, which handles simple input/output tasks to communicate with the user, such as drawing a dialog box on the display or waiting for user input.
A collective term encompassing both thin client computing, and its technological predecessor, text terminals (which are text-only), is centralized computing. Thin clients and text terminals can both require powerful central processing facilities in the servers, in order to perform all significant processing tasks for all of the clients.
Diskless nodes can be seen as a compromise between fat clients (such as ordinary personal computers) and centralized computing, using central storage for efficiency, but not requiring centralized processing, and making efficient use of the powerful processing power of even the slowest of contemporary CPUs, which would tend to sit idle for much of the time under the centralized computing model.
Centralized computing or Thin client |
Diskless node | Fat client | |
---|---|---|---|
Local hard drives used | No | No | Yes |
Local general-purpose processing used | No | Yes | Yes |
The operating system (OS) for a diskless node is loaded from a server, using network booting. In some cases, removable storage may be used to initiate the bootstrap process, such as a USB flash drive, or other bootable media such as a floppy disk, CD or DVD. However, the firmware in many modern computers can be configured to locate a server and begin the bootup process automatically, without the need to insert bootable media.
For network auto-booting, the Preboot Execution Environment (PXE) or Bootstrap Protocol (BOOTP) network protocols are commonly used to find a server with files for booting the device. Standard full-size desktop PCs are able to be network-booted in this manner with an add-on network card that includes a UNDI boot ROM. Diskless network booting is commonly a built-in feature of desktop and laptop PCs intended for business use, since it can be used on an otherwise disk-booted standard desktop computer to remotely run diagnostics, to install software, or to apply a disk image to the local hard drive.
After the bootstrapping process has been initiated, as described above, bootstrapping will take place according to one of three main approaches.
This third approach makes it easier to use client OS than having a complete disk image in RAM or using a read-only file system. And it is compatible with most of modern OSes (Linux and Windows, knowing that Windows cannot live on read-only media...). In this approach, the system uses some "write cache" that stores every data that a diskless node has written. This write cache is usually a file, stored on a server (or on the client storage if any). It can also be a portion of the client RAM. This write cache can be persistent or volatile. When volatile, all the data that has been written by a specific client to the virtual disk are dismissed when said client is rebooted, and yet, user data can remain persistent if recorded in user (roaming) profiles or home folders (that are stored on remote servers). The two major commercial products (the one from Hewlett-Packard, and the other one from Citrix Systems) that allow the deployment of Diskless Nodes that can boot Microsoft Windows or Linux client OS use such write caches. The Citrix product cannot use persistent write cache, but VHD and HP product can.
Windows 3.x and Windows 95 OSR1[3] supported Remote Boot operations, from Netware servers,[4] Windows NT Servers[5] and even DEC Pathworks servers.[6]
Third party software Vendors such as Qualystem (acquired by Neoware), LanWorks (acquired by 3Com), Ardence (acquired by Citrix),APCT[7] and Xtreamining Technology[8] have developed and marketed software products aimed to remote-boot newer versions of the Windows product line: Windows 95 OSR2 and Windows 98 were supported by Qualystem and Lanworks, Windows NT was supported by APCT and Ardence (called VenturCom at that time), and Windows 2000/XP/2003/Vista/Windows 7 are supported by Hewlett Packard (which acquired Neoware which had previously acquired Qualystem) and Citrix Systems (which acquired Ardence).
With essentially a single OS image for an array of machines (with perhaps some customizations for differences in hardware configurations among the nodes), installing software and maintaining installed software can be more efficient. Furthermore, any system changes made during operation (due to user action, worms, viruses, etc.) can be either wiped out when the power is removed (if the image is copied to a local RAM disk) such as Windows XP Embedded remote boot[9][10] or prohibited entirely (if the image is a network filesystem). This allows use in public access areas (such as libraries) or in schools etc. where users might wish to experiment or attempt to "hack" the system.
However, it is not necessary to implement network booting to achieve either of the above advantages - ordinary PCs (with the help of appropriate software) can be configured to download and reinstall their operating systems on (e.g.) a nightly basis, with extra work compared to using shared disk image that diskless nodes boot off.
Modern diskless nodes can share the very same disk image, using a 1:N relationship (1 disk image used simultaneously by N diskless nodes). This makes it very easy to install and maintain software applications: The administrator needs to install or maintain the application only once, and the clients can get the new application as soon as they boot off the updated image. Disk image sharing is made possible because they use the write cache: No client competes for any writing in a shared disk image, because each client writes to its own cache.
All the modern diskless nodes systems can also use a 1:1 Client-to-DiskImage relationship, where one client "owns" one disk image and writes directly into said disk image. No write cache is used then.
Making a modification in a shared disk image is usually made this way:
1. The administrator makes a copy of the shared disk image that he/she wants to update (this can be done easily because the disk image file is opened only for reading)
2. The administrator boots a diskless node in 1:1 mode (unshared mode) from the copy of the disk image he/she just made
3. The administrator makes any modification to the disk image (for instance install a new software application, apply patches or hotfixes...)
4. The administrator shutdowns the diskless node that was using the disk image in 1:1 mode
5. The administrator shares the modified disk image
6. The diskless nodes use the shared disk image (1:N) as soon as they are rebooted.
The use of central disk storage also makes more efficient use of disk storage. This can cut storage costs, freeing up capital to invest in more reliable, modern storage technologies, such as RAID arrays which support redundant operation, and storage area networks which allow hot-adding of storage without any interruption. Further, it means that losses of disk drives to mechanical or electrical failure—which are statistically highly probable events over a timeframe of years, with a large number of disks involved—are often both less likely to happen (because there are typically less disk drives that can fail) and less likely to cause interruption (because they would likely be part of RAID arrays). This also means that the nodes themselves are less likely to have hardware failures than fat clients.
Diskless nodes share these advantages with thin clients.
However, this storage efficiency can come at a price. As often happens in computing, increased storage efficiency sometimes comes at the price of decreased performance.
Large numbers of nodes making demands on the same server simultaneously can slow down everyone's experience. However, this can be mitigated by installing large amounts of RAM on the server (which speeds up read operations by improving caching performance), by adding more servers (which distributes the I/O workload), or by adding more disks to a RAID array (which distributes the physical I/O workload). In any case this is also a problem which can affect any client-server network to some extent, since, of course, fat clients also use servers to store user data.
Indeed, user data may be much more significant in size and may be accessed far more frequently than operating systems and programs in some environments, so moving to a diskless model will not necessarily cause a noticeable degradation in performance.
Greater network bandwidth (i.e. capacity) will also be used in a diskless model, compared to a fat client model. This does not necessarily mean that a higher capacity network infrastructure will need to be installed—it could simply mean that a higher proportion of the existing network capacity will be used.
Finally, the combination of network data transfer latencies (physically transferring the data over the network) and contention latencies (waiting for the server to process other node's requests before yours) can lead to an unacceptable degradation in performance compared to using local drives, depending on the nature of the application and the capacity of the network infrastructure and the server.
Another example of a situation where a diskless node would be useful is in a possibly hazardous environment where computers are likely to be damaged or destroyed, thus making the need for inexpensive nodes, and minimal hardware a benefit. Again, thin clients can also be used here.
Diskless machines may also consume little power and make little noise, which implies potential environmental benefits and makes them ideal for some computer cluster applications.
Major corporations tend to instead implement thin clients (using Microsoft Windows Terminal Server or other such software), since much lower specification hardware can be used for the client (which essentially acts as a simple "window" into the central server which is actually running the users operating system as a login session). Of course, diskless nodes can also be used as thin clients. Moreover, thin client computers are increasing in power to the point where they are becoming suitable as fully-fledged diskless workstations for some applications.
Both thin client and diskless node architectures employ diskless clients which have advantages over fat clients (see above), but differ with regard to the location of processing.